Abstract

Reusable software components need well-defined interfaces, rigorously and 
completely documented features, and a design amenable both to reuse and to for- 
mal verification; all these requirements call for expressive specifications. This 
paper outlines a rigorous foundation to model-based contracts, a methodology to 
equip classes with expressive contracts supporting the accurate design, implemen- 
tation, and formal verification of reusable components. Model-based contracts 
conservatively extend the classic Design by Contract by means of expressive mod- 
els based on mathematical notions, which underpin the precise definitions of no- 
tions such as abstract equivalence and specification completeness. Preliminary 
experiments applying model-based contracts to libraries of data structures demonstrate
the versatility of the methodology and suggest that it can introduce notions that are
rigorous, yet still intuitive and natural to use in practice.



1 Introduction 

The case for precise software specifications involves several well-known arguments; in 
particular, specifications help understand the problem before building a solution, and 
they are necessary for verifying implementations. In the case of a library of reusable 
software components, precise specifications have another application, essential to the 
effective use of the library: providing client programmers with a description of the 
interface (the API). To help produce such specifications, Design by Contract techniques
[18] let authors of reusable modules equip them with specification elements known
as "contracts" (routine preconditions and postconditions, class invariants), which tools 
from the development environment can extract to produce automatically generated API 
documentation. 

While specifications primarily intended for purposes other than component devel- 
opment typically use a specification language based on mathematics, approaches using 
Design by Contract, such as Eiffel [18], JML [17] and Spec# [2], rely instead on an
assertion language embedded in the programming language. In Eiffel, for example,
contracts are expressed through assertions built out of the language's Boolean expressions,
with a few extensions; the most notable of these extensions is the old notation,
which makes it possible to express postconditions as properties of both the starting and
ending states of the computation. This approach adds a significant element to the list 
of benefits of precise specifications: being expressed in the programming language, 
contracts can be evaluated during execution. (We will use the term "executable asser- 
tions", although this is really about evaluation rather than execution; another possible 
term is "embedded" assertion, to emphasize that the assertion language is included in
the programming language.) As a consequence, contracts have played a major role in 
testing, especially for Eiffel, where an advanced testing environment, AutoTest [19],
takes advantage of contracts for automatic test generation; more generally, Eiffel pro- 
grammers routinely rely on run-time contract evaluation for testing and debugging. 

Another practical benefit of the approach is teachability: programmers already un- 
derstand Boolean expressions, and do not need to learn a separate specification lan- 
guage. These practical advantages of executable assertions have traditionally come 
at a price: expressiveness. Unlike a full-fledged specification language (such as B
[1], based on set theory), an assertion language embedded in a programming language
makes it harder to express the full specification of programs and components. As a typ- 
ical example, the postcondition of a "push" operation on a stack in the existing standard 
Eiffel library expresses that the new top of the stack will be the item just pushed, and 
that the number of items will have been increased by one; but it typically does not state, 
except in the form of a comment, that the other elements of the stack are unaffected. 
This example is typical: an extensive study 111 indicates that in practice Eiffel classes 
contain many contracts, but (see also [21]) they cover only part of the programmers'
informal understanding of the specification.

Can we retain all the advanced benefits of specifications, in particular support com- 
pleteness of specifications and static checks (including proofs), while retaining an ex- 
ecutable specification language that can also be used for testing? The present work 
proposes a positive answer, based on the idea of models. 

Specifications, in this approach, do not require any special language beyond the 
classical assertion language embedded in the programming language. Instead, they 
rely on a methodological principle: associate with every class one or more model 
queries specifying the semantics of the associated objects through standard mathemat- 
ical concepts, represented by instances of model classes. The model classes are also 
expressed in the programming language, but they are just direct translations of math- 
ematical concepts (such as sets, functions, relations etc.); they have no operational 
properties (attributes (fields), assignment, side effects, procedures and such), so that 
the corresponding objects are immutable. The model queries of a normal (non-model) 
class are expressed in terms of such model classes; for example a stack class can have 
a model query sequence of the model type SEQUENCE, associating a sequence with every 
stack (the sequence of stack items, starting for example from the top). It is then pos- 
sible to specify operations of the class through their effect on the model queries; for 
example the push operation yields a new stack whose sequence query yields a sequence
starting with the element being pushed and continuing with the elements of the origi- 
nal sequence. In this example the class only has one model query (sequence), but any 
number of model queries is possible; the model queries can be existing features of the 
class, or new features added for the sole purpose of specification. 
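
As a rough illustration of the idea, the following Python sketch (an analogue, not Eiffel and not taken from any existing library) equips a toy stack with a model query. The class name Stack, the tuple standing in for a SEQUENCE model object, and the local variable playing the role of old are all assumptions made for the example.

```python
class Stack:
    """Toy stack whose abstract state is a sequence listed from the top."""

    def __init__(self):
        self._items = []  # concrete representation: the top is the last entry

    @property
    def sequence(self):
        # Model query: an immutable tuple, starting from the top of the stack.
        return tuple(reversed(self._items))

    def push(self, v):
        old_sequence = self.sequence  # plays the role of Eiffel's `old`
        self._items.append(v)
        # Postcondition on the model: v followed by the original sequence.
        assert self.sequence == (v,) + old_sequence


s = Stack()
s.push(1)
s.push(2)
print(s.sequence)  # (2, 1)
```

The postcondition constrains the whole sequence, so it also states that the elements below the new top are unaffected, which is the property the classic contract leaves to a comment.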

This idea of model-based contracts is not new; our own previous work [24, 23] and,
among others, JML [17] introduced the concept and provided libraries of model classes.
Developing a rigorous and systematic approach to model-based specifications is the 
main contribution of the present paper. Section 3 shows how the interface of a class
defines unambiguously a notion of abstract space, which in turn determines the model of
the class; programmers can easily introduce model classes and model queries in accordance
with this model. Section 3 also outlines precise guidelines to write contracts that
refer to the chosen model queries. The guidelines come with a definition of complete- 
ness of the postcondition of a feature with respect to the class model. The definition is 
formal, yet amenable to informal reasoning; it is practically useful in assessing whether 
a contract is sufficiently detailed or is likely omitting some important details of what 
the feature achieves. 

Section 4 describes two case studies that used this methodology for model-based
specifications to develop libraries of data structures with strong contracts. The results 
achieved show that the methodology is successful in delivering well-designed com- 
ponents with expressive — usually complete — specifications. Most advantages of 
standard Design by Contract are retained, such as congeniality to programmers and
ease of reasoning, while encouraging a more accurate evaluation of design choices and an
impeccable definition of interfaces. The executability of most model classes even supports
the reuse of Eiffel's automated contract-based testing infrastructure with more
expressive contracts, which boosts the effectiveness of automated testing in finding 
defects in developed software. 

2 Motivation and overview 

Design by Contract (DbC) is a discipline of analysis, design, implementation, and man- 
agement of software. It relies on the fundamental idea of defining the role of any com- 
ponent in the system in terms of a contract that formalizes the obligations and benefits 
of that component relative to the rest of the system. Concretely, the contract is a
collection of assertions (preconditions, postconditions, and invariants) that constitute
the module's specification. 

2.1 Some limitations of Design by Contract 

To emphasize the seamless connection that must exist between specification and im- 
plementation, and to make writing contracts palatable to the programmer, DbC uses 
the same notation for expressions in the implementation and in the specification. This 
choice successfully encourages programmers to write contracts [3]. On the other hand,
it also restricts the assertions that can be expressed — or that can be expressed easily. 
This restriction ultimately impedes the formalization and verification of full functional 
correctness and even limits the scope of application of DbC for the correct design of an 
implementation. Let us demonstrate this on a couple of examples from the EiffelBase 
library.

Lines 1-14 in Table 1 show a portion of class LINKED_LIST, implementing a dynamic
list. Features (members) count and index record respectively the number of elements
stored in the list and the current position of the internal cursor. Routine put_right
inserts an element v to the right of the current position of the cursor, without moving
it. The postcondition of the routine (clause ensure) asserts that inserting an element
increments count by one but does not change index. This is correct, but it does not
capture the gist of the semantics of insertion: the list after insertion is obtained by all
the elements that were in the list up to position index, followed by element v and then by
all elements that were to the right of index.

 2  class LINKED_LIST [G]
 3    count: INTEGER -- Number of elements
 4
 5    index: INTEGER -- Current cursor position
 6
 7    put_right (v: G)
 8        -- Add 'v' to the right of cursor.
 9      require 0 ≤ index ≤ count
10      do ...
11      ensure
12        count = old count + 1
13        index = old index
14      end
15    duplicate (n: INTEGER): LINKED_LIST
16        -- Copy of sublist of length 'n' beginning at current position
17      require n ≥ 0 do ... ensure Result.index = 0 end
18
19  end
20
21  class TABLE [G, K]
22    put (v: G; k: K)
23        -- Associate value 'v' with key 'k'.
24      require valid_key (k)
25      deferred end
26
27  end

Table 1: Snippets from the EiffelBase classes LINKED_LIST (lines 2-19) and TABLE (lines
21-27).

Expressing such complex facts is impossible or exceedingly complicated with the 
standard assertion language; as a result most specifications are incomplete in the sense 
that they fail to capture precisely the functional semantics of routines. Weak specifi- 
cations hinder formal verification in two ways. First, establishing weak postconditions 
is simple, but confidence in the full functional correctness of a verified routine will 
be low: the quality of specifications limits the value of verification. Second, weak 
contracts negatively affect verification modularity: it is impossible to establish what a
routine r achieves, if r calls another routine s whose contract is not strong enough to 
document its effect within r precisely. 

Weak assertions limit the potential of many other applications of DbC. Specifications,
for example, should document the abstract semantics of operations in deferred
classes (classes without an implementation). Weak contracts cannot fully do so; as a 
result, programmers have fewer safeguards to prevent inconsistencies in the design and 
fewer chances to make deferred classes useful to clients through polymorphism and 
dynamic dispatching. 

Feature put in class TABLE (lines 21-27 in Table 1) is an example of such a phenomenon.
It is unclear how to express the abstract semantics of put with standard
contracts. In particular, the absence of a postcondition leaves it undefined what should 
happen when an element is inserted with a key that is already associated to some other 
element: should put replace the previous element with the new one or cancel the insertion
of the new element? Indeed, some heirs of TABLE implement put with a replacement
semantics (such as class ARRAY), while others disallow overriding of preexisting
mappings with put (such as class HASH_TABLE). Some classes (including HASH_TABLE)
even introduce another feature force that implements the replacement semantics. This 
obscures the behavior of routines to clients and makes it questionable whether put has 
been introduced at the right point in the inheritance hierarchy. 
  note model: sequence, index
  class LINKED_LIST [G]
    sequence: MML_SEQUENCE [G]
        -- Sequence of elements
      do ... end

    count: INTEGER -- Number of elements
      ensure Result = sequence.count end

    index: INTEGER -- Current cursor position

    put_right (v: G)
        -- Add 'v' to the right of cursor.
      require 0 ≤ index ≤ count
      do ...
      ensure
        sequence = old (sequence.front (index)
            .extended (v) + sequence.tail (index + 1))
        index = old index
      end
  end

Table 2: Classes LINKED_LIST (above) and TABLE (listed further below) with model-based
contracts.

2.2 Enhancing Design by Contract with models 

This paper presents an extension of DbC that addresses the aforementioned problems. 
The extension conservatively enhances DbC with model classes: immutable classes 
representing mathematical concepts that provide for more expressive specifications. 
Wrapping mathematical entities with classes supports richer contracts without need 
to extend the notation, which remains the one familiar to programmers as in DbC. 
Contracts using model classes are called model-based contracts. 

Table 2 shows an extension of the examples in Table 1 with model-based contracts.
LINKED_LIST is augmented with a query sequence that returns an instance of class
MML_SEQUENCE, a model class representing a mathematical sequence of elements of
homogeneous type; the implementation, omitted for brevity, builds sequence according
to the actual content of the list. The meta-annotation note declares the two features
sequence and index as the model of the class; every contract will rely on the abstraction
they provide. In particular, the postcondition of put_right can precisely describe the
effect of the routine: the new sequence is the concatenation of the old sequence up to
index, extended with element v, with the tail of the old sequence starting after index.
We can assert that the new postcondition (including the clause about index) is complete
with respect to the model of the class, because it completely defines the effect of
put_right on the abstract model. This notion of completeness is a powerful guide to
writing accurate specifications that make for well-defined interfaces and verifiable classes.
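
The gap between the classic and the model-based postcondition of put_right can be made concrete with a small Python sketch. It assumes that front (i) denotes the first i elements of a sequence and that tail (i + 1) denotes the elements from position i + 1 onwards, with 1-based positions; all function names are hypothetical, and plain Python lists stand in for the list representation.

```python
def put_right(items, index, v):
    # Intended behaviour: insert v just right of cursor position `index`.
    return items[:index] + [v] + items[index:], index


def put_right_buggy(items, index, v):
    # Faulty variant: appends at the end instead of right of the cursor.
    return items + [v], index


def weak_post(old_items, old_index, new_items, new_index):
    # count = old count + 1 and index = old index
    return len(new_items) == len(old_items) + 1 and new_index == old_index


def model_post(old_items, old_index, v, new_items, new_index):
    # sequence = old (sequence.front (index).extended (v) + sequence.tail (index + 1))
    front = tuple(old_items[:old_index])
    tail = tuple(old_items[old_index:])
    return tuple(new_items) == front + (v,) + tail and new_index == old_index


old_items, old_index = [10, 20, 30], 1
good = put_right(old_items, old_index, 99)
bad = put_right_buggy(old_items, old_index, 99)

print(weak_post(old_items, old_index, *bad))        # True: the weak contract accepts the bug
print(model_post(old_items, old_index, 99, *bad))   # False: the model-based contract rejects it
print(model_post(old_items, old_index, 99, *good))  # True
```

The faulty variant changes count and index exactly as the classic contract demands, so only the clause on the sequence model exposes it.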

The mathematical notion of a map, encapsulated by the model class MML_MAP,
is the natural model for the class TABLE. Feature map cannot have an implementation
yet, because TABLE is deferred and hence it is not committed to any representation of 
data. Nonetheless, the mere availability of a model class supports complex specifica- 
tions already at this abstract level. In particular, writing a complete postcondition for 
routine put requires committing to a specific semantics for insertion. The example in
Table 2 chooses the replacement semantics; correspondingly, all heirs of TABLE will
have to conform to this semantics, guaranteeing a coherent reuse of TABLE throughout
the class hierarchy.

  note model: map
  class TABLE [G, K]
    map: MML_MAP [G, K]
        -- Map of keys to values
      deferred end

    put (v: G; k: K)
        -- Associate value 'v' with key 'k'.
      require map.domain [k]
      deferred
      ensure
        map = old map.replaced_at (k, v)
      end
  end

Table 2 (continued): Class TABLE with model-based contracts.
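
A minimal Python sketch of how a complete postcondition on the map model forces heirs to agree on one semantics is given below. The classes ArrayTable and KeepFirstTable and the helper replaced_at are hypothetical stand-ins invented for the illustration (a dict plays the role of the map model); they are not the EiffelBase or MML classes.

```python
def replaced_at(m, k, v):
    # Stand-in for the model feature: a new map equal to m except that k maps to v.
    updated = dict(m)
    updated[k] = v
    return updated


class ArrayTable:
    """Toy heir of TABLE whose valid keys are 0 .. size - 1."""

    def __init__(self, size):
        self._cells = [None] * size

    @property
    def map(self):
        # Model query: the abstract map from keys to values.
        return dict(enumerate(self._cells))

    def put(self, v, k):
        assert k in self.map                           # precondition: map.domain [k]
        old_map = self.map
        self._cells[k] = v
        assert self.map == replaced_at(old_map, k, v)  # inherited postcondition


class KeepFirstTable(ArrayTable):
    """Heir that refuses to override an existing mapping."""

    def put(self, v, k):
        assert k in self.map
        old_map = self.map
        if self._cells[k] is None:
            self._cells[k] = v
        assert self.map == replaced_at(old_map, k, v)  # fails on a second put


def breaks_contract(table):
    table.put("a", 0)
    try:
        table.put("b", 0)
    except AssertionError:
        return True
    return False


print(breaks_contract(ArrayTable(2)))      # False
print(breaks_contract(KeepFirstTable(2)))  # True
```

With the postcondition in place, the heir that keeps the first value is rejected as soon as a key is reused, instead of silently diverging from its parent.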

3 Foundations of model-based contracts 
3.1 Specifying classes with models 

This subsection describes a rigorous approach to equipping classes with expressive 
contracts. 

3.1.1 Interfaces, references, and objects. 

The definitions of abstract objects and models (introduced in the remainder) rely on 
the following simple assumptions about classes. A class C denotes a collection of 
objects. Expressions such as o : C define o as a reference to an object of class C; the 
notation is overloaded for conciseness, so that occurrences of o can denote the object 
it references or the reference itself, according to the context. Each class C defines 
a notion of reference equality $=_C$ and of object equality $\doteq_C$; both are equivalence
relations. Two objects $o_1, o_2 : C$ of class C can be reference equal (written $o_1 =_C o_2$)
or object equal (written $o_1 \doteq_C o_2$). Reference equality is meant to capture whether
$o_1$ and $o_2$ are aliases for the same physical object, whereas object equality is meant
to hold for (possibly) physically distinct objects with the same actual content. The 
following discussion is however independent of the particular choice of reference and 
object equality. 

The principle of information hiding prescribes that each class define an interface: 
the set of its publicly accessible features [18]. It is good practice to partition features
into queries and commands; queries are functions of the object state, whereas commands
modify the object state but do not return any value. $I_C = Q_C \cup M_C$ denotes the
interface of a class C partitioned into queries $Q_C$ and commands $M_C$.¹ It is convenient
to partition all queries into value-bound queries $Q_C^v$ and reference-bound queries $Q_C^r$.
Value-bound queries should create fresh objects to return (or more generally objects 
that were unknown to the client before calling the query), whereas reference-bound 
queries give the client direct access, through a reference, to parts of the target object 
or of the query arguments. In other words, clients of a value-bound query are insensi- 
tive to whether they received a unique fresh object or they are just sharing a reference 
to a previously existing one. The chosen partitioning between value-bound and
reference-bound queries does not affect the following discussion, although it is usually quite
natural to adhere to this informal distinction when designing a class. 

Example 1. Query item (Table 3) is reference-bound, as the client receives the very
same physical object that was earlier inserted in the list. Query duplicate (Table 3) is
instead value-bound, as it returns a copy of a portion of the list. 

¹ Constructors need no special treatment and can be modeled as queries returning new objects.

The classification in value-bound and reference-bound extends naturally to arguments
of features: if the feature does not rely on having a direct reference to the actual
argument (as opposed to a copy of it), the argument is value-bound; otherwise, it is 
reference-bound. 

3.1.2 Abstract object space. 

The interface $I_C$ induces an equivalence relation $\asymp_C$ over objects of class C called
abstract equality and defined as follows: $o_1 \asymp_C o_2$ holds for $o_1, o_2 : C$ iff for any
applicable sequence of calls to commands $m_1, m_2, \ldots \in M_C$ and a query $q \in Q_C$ returning
objects of some class T, the qualified calls $o_1.m_1; o_1.m_2; \cdots$ and $o_2.m_1; o_2.m_2; \cdots$
(with identical actual arguments where appropriate) drive $o_1$ and $o_2$ in states such that
if q is reference-bound then $o_1.q =_T o_2.q$ (reference equality), and if q is value-bound
then $o_1.q \doteq_T o_2.q$ (object equality). Intuitively, two objects are equivalent with
respect to $\asymp_C$ if a client cannot distinguish them by any sequence of calls to public
features.

Abstract equality defines an abstract object space: the quotient set $A_C = C / \asymp_C$
of C (as a set of objects) by $\asymp_C$. As a consequence, two objects are equivalent w.r.t.
$\asymp_C$ iff they have the same abstract (object) state. Any concrete set that is isomorphic
to $A_C$ is called a model of C.

Example 2. A queue class typically consists of the queries item, count, and empty — 
returning the next element to be dequeued, the total number of elements in the queue, 
and a fresh empty queue — and the commands put and remove — to enqueue an element 
and dequeue the next element. If remove were not part of the interface, any element in 
the queue but the least recently inserted one would be inaccessible to clients; the model 
of such a class would then be a pair of type $\mathbb{N} \times G$ recording the current number of
elements and the least recently enqueued element of generic type G. Including remove in the
interface, as it usually is the case for queues, allows clients to read the whole sequence 
of enqueued elements. Hence, two queues with full interfaces are indistinguishable iff 
they have the very same sequence of elements; the model of a queue class with full 
interface is then an abstract sequence of type $G^*$.
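
The dependence of abstract equality on the interface can be explored with a bounded search, sketched here in Python. This is only an approximation of the definition: it tries command sequences up to a fixed depth, compares query results with ==, and treats item on an empty queue as returning None rather than enforcing a precondition. The Queue class and the helper names are assumptions made for the example.

```python
import copy
from itertools import product


class Queue:
    def __init__(self, items=()):
        self._items = list(items)  # oldest element first

    # Queries
    def item(self):
        return self._items[0] if self._items else None

    def count(self):
        return len(self._items)

    # Commands
    def put(self, v):
        self._items.append(v)

    def remove(self):
        if self._items:
            self._items.pop(0)


def distinguishable(q1, q2, commands, depth=3):
    """Search for a command sequence after which some query result differs."""
    for n in range(depth + 1):
        for calls in product(commands, repeat=n):
            a, b = copy.deepcopy(q1), copy.deepcopy(q2)
            for call in calls:
                call(a)
                call(b)
            if (a.item(), a.count()) != (b.item(), b.count()):
                return True
    return False


def put_zero(q):
    q.put(0)


def remove(q):
    q.remove()


x, y = Queue([1, 2]), Queue([1, 3])
print(distinguishable(x, y, [put_zero, remove]))  # True: remove exposes the second element
print(distinguishable(x, y, [put_zero]))          # False: only count and the oldest element show
```

Without remove the two queues fall into the same abstract state, matching the pair model; with remove the whole sequence becomes observable.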

As all the following examples will suggest, the most natural design choice imple- 
ments object equality to have the same semantics as abstract equality. Notice, however, 
that complying or not with this rule of thumb does not affect the soundness of the
definitions in the present paper, nor does it introduce circularities in the definition of
abstract equality.

3.1.3 Model classes. 

The model of a class C is expressed as a collection $D_C = D_C^1, D_C^2, \ldots, D_C^n$
of model classes.² Model classes are immutable classes designed for specification
purposes; essentially, they are wrappers of rigorously defined mathematical
entities: elementary sorts such as Booleans, integers, and object references, as well
as more complex structures such as sets, bags, relations, maps, and sequences. The
MML library [23] provides a variety of such model classes, equipped with features
that correspond to common operations on the mathematical structure they represent,
including first-order quantification. For example, class MML_SET models sets of
elements of homogeneous type; it includes features for operations such as membership
and quantification over all elements of the set that satisfy a certain predicate (passed as
a function object).

² The model may include the same class multiple times.

Example 3. As we discussed in Example 2, a sequence is a suitable model for a queue;
it can be represented by class MML_SEQUENCE. To represent the model of a linked
list with internal cursor, we can combine a sequence of class MML_SEQUENCE with an
element of class INTEGER to represent the position of the cursor; this assumes that no 
information about the pointer structure of the list in the heap is accessible through the 
interface of the class. 

3.1.4 Model queries. 

Every class C provides a collection of public model queries $S_C = s_C^1, s_C^2, \ldots, s_C^n$, one
for each component model class in $D_C$. Each model query $s_C^i$ returns an instance of
the corresponding model class $D_C^i$ that represents the current value of the i-th component
of the model. (Informally, the values returned by model queries are analogous
to the coefficients expressing the abstract state as a combination of independent basis
vectors spanning the whole space.) Since the abstract object state should always
be defined between operations and should not depend on the state of any other object,
model queries are typically argumentless and without precondition. Clauses in
the class invariant can constrain the values of the model queries to match precisely
the abstract states of the model. For example, model query index: INTEGER returning
the cursor position of the LINKED_LIST in Table 1 should be constrained by an invariant
clause 0 ≤ index ≤ sequence.count + 1. A meta-annotation note model: $s_C^1, s_C^2, \ldots$ lists all
model queries of the class (see Table 2 for an example).

Programmers can add model queries incrementally to classes developed with DbC. 
In fact, it is likely that some model queries are already used in the implementation 
before models are added explicitly; for example feature index of class LINKED_LIST
(Table 2). Additional model queries return the remaining components of the model for
specification purposes, such as sequence in LINKED_LIST.

Our approach prefers to implement new model queries as functions rather than 
attributes. This choice facilitates a purely descriptive usage of references to model 
queries in specifications. In other words, instead of augmenting routine bodies with 
bookkeeping instructions that update model attributes, routine postconditions are ex- 
tended with clauses that describe the new value returned by model queries in terms of 
the old one. This has the advantage of enforcing a cleaner division between implemen- 
tation and specification, while better modularizing the latter at routine level (properties 
of model attributes are typically gathered in the class invariant). A meta-annotation of 
the form note specification tags model queries that are not meant for use in implementation;
runtime checking of annotations calling these model queries can be disabled if
performance is a concern. 
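
The preference for model queries implemented as functions can be sketched in Python as follows. The linked representation, the routine put_front, and the invariant method are hypothetical choices made for this illustration; the point is only that sequence is recomputed from the nodes on demand, so the routine body carries no bookkeeping for the model.

```python
class Node:
    def __init__(self, value, next_node=None):
        self.value = value
        self.next_node = next_node


class LinkedList:
    def __init__(self):
        self._first = None
        self.index = 0  # model query that the implementation already uses

    @property
    def sequence(self):
        # Specification-only model query, computed from the nodes when asked.
        out, node = [], self._first
        while node is not None:
            out.append(node.value)
            node = node.next_node
        return tuple(out)

    def invariant(self):
        # 0 <= index <= sequence.count + 1
        return 0 <= self.index <= len(self.sequence) + 1

    def put_front(self, v):
        old_sequence = self.sequence
        self._first = Node(v, self._first)  # no instruction updates the model
        # The postcondition describes the new model value in terms of the old one.
        assert self.sequence == (v,) + old_sequence
        assert self.invariant()


lst = LinkedList()
lst.put_front(2)
lst.put_front(1)
print(lst.sequence)  # (1, 2)
```

Because the assertions are the only places that call sequence, disabling assertion checking (for instance with Python's -O switch) removes the cost of the model query, loosely mirroring the role of the note specification tag.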
  note model: sequence, index
  class LINKED_LIST [G]
    ...
    has (v: G): BOOLEAN
        -- Does list include 'v'? (Reference equality)
      do ...
      ensure Result iff sequence.has (v) end

    item: G
        -- Value at cursor position
      require
        sequence.domain [index]
      ensure
        Result = sequence [index]
      end

Table 3: Snippets of class LINKED_LIST with model-based contracts (continued from
Table 2).

3.1.5 Model-based contracts. 

Let C be a class equipped with model queries whose interface $I_C$ is partitioned into
queries $Q_C$ and commands $M_C$. $Q_C$ now includes the model queries $S_C \subseteq Q_C$
together with other queries $R_C = Q_C \setminus S_C$ (note that this does not change the abstract
space according to the definitions given at the beginning of the section). Queries in
$R_C$ are called standard queries. The rest of the section contains guidelines to writing
model-based contracts for commands in $M_C$ and queries in $R_C$.

• The precondition of a feature is a constraint on the abstract values of its value- 
bound arguments and, possibly, on the actual references to its reference-bound 
arguments. The target object, in particular, can be considered an implicit value- 
bound argument. For example, the precondition map.domain [k] of feature put in 
class TABLE (Table 2), refers to the abstract state of the target object, given by
the model query map, and to its actual reference-bound argument k. 

• Postconditions should refer to abstract states only through model queries. This 
emphasizes the components of the abstract state that a feature modifies or relies 
upon, which in turn facilitates understanding and reasoning on the semantics of 
a feature. 

• The postcondition of a command defines a relation between the prestate and 
the poststate of its arguments and the target object; prestate and poststate refer 
respectively to the state before and after executing the command. More precisely, 
the postcondition mentions only abstract values of its value-bound arguments and 
possibly the actual references to its reference-bound arguments; the target object 
is considered value-bound both in the prestate and in the poststate. 

It is common that a command only affects a few components of the abstract state
and leaves all the others unchanged. Accordingly, the closed world assumption
is convenient: the value of any model query $s \in S_C$ that is not mentioned in
the postcondition is assumed not to be modified by the command, as if s = old s
were a clause of the postcondition. When the closed world assumption is wrong,
explicit clauses in the postcondition should establish the correct semantics. If a
command may modify the value of a model query s but the actual new value is
not known precisely and s is not mentioned in other clauses of the postcondition,
add a clause relevant (s) to the postcondition of the command (in terms of
implementation, relevant is just a constant function that returns true). If a command
does not affect the value of a model query s but the postcondition of the command
mentions s, add a clause s = old s to the postcondition of the command.

• The postcondition of a query defines the result as a function of its arguments and
the target object (with the usual discipline of mentioning only abstract values
of value-bound arguments and target object and possibly actual references to
reference-bound arguments). Value-bound queries define the abstract state of
the result, whereas reference-bound queries describe an actual reference to it.
For example, compare the postcondition of the reference-bound query item from
class LINKED_LIST (Table 3), which precisely defines a reference to the returned
list element, with the postcondition of the value-bound query duplicate in the same
class, which specifies the abstract state of the returned list.

• A clear-cut separation between queries and commands assumes abstract purity
for all queries: executing a query leaves the abstract state of all its arguments and
of the target object unchanged.

    duplicate (n: INTEGER): LINKED_LIST [G]
        -- A copy of at most 'n' elements
        -- starting at cursor position
      require n ≥ 0
      do ...
      ensure
        Result.sequence = sequence.interval (index, index + n - 1)
        Result.index = 0
      end

    make_empty
        -- Create an empty list
      ensure sequence.is_empty and index = 0
      end
    ...
  end

Table 3 (continued): Further snippets of class LINKED_LIST with model-based contracts.

  note model: bag
  class COLLECTION [G]
    bag: MML_BAG [G]

    is_empty: BOOLEAN
      ensure Result = bag.is_empty end

    wipe_out
      ensure bag.is_empty end

    put (v: G)
      ensure bag = old bag.extended (v) end
  end

  note model: sequence
  class DISPENSER [G]
    inherit COLLECTION [G]

    sequence: MML_SEQUENCE [G]

  invariant
    bag.domain = sequence.range
    bag.domain.for_all (agent (x: G): BOOLEAN
      bag [x] = sequence.occurrences (x))
  end

Table 4: Snippets of classes COLLECTION and DISPENSER with model-based contracts.
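
The closed world assumption lends itself to a mechanical check, sketched below in Python. The helper run_command, the toy class Cursor, and its two commands are hypothetical; for brevity the model queries are plain attributes here, whereas the approach described above prefers functions.

```python
def run_command(obj, command, model_queries, mentioned):
    """Run `command` on `obj`, then enforce the closed world assumption:
    every model query not listed in `mentioned` must keep its old value."""
    old = {name: getattr(obj, name) for name in model_queries}
    command(obj)
    for name in model_queries:
        if name not in mentioned:
            assert getattr(obj, name) == old[name], name + " changed silently"
    return old


class Cursor:
    """Toy class with model queries `sequence` (tuple) and `index` (int)."""

    def __init__(self, items):
        self.sequence = tuple(items)
        self.index = 0

    def forth(self):
        self.index += 1  # the postcondition mentions only index

    def sloppy_forth(self):
        self.index += 1
        self.sequence = self.sequence[1:]  # undeclared change to sequence


def respects_frame(command):
    c = Cursor([1, 2, 3])
    try:
        run_command(c, command, ["sequence", "index"], mentioned={"index"})
    except AssertionError:
        return False
    return True


print(respects_frame(Cursor.forth))         # True
print(respects_frame(Cursor.sloppy_forth))  # False
```

The implicit clause s = old s is thus generated for every model query that the postcondition leaves out, which is why a command that touches sequence without saying so is flagged.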

3.1.6 Inheritance and model-based contracts. 

A class $C'$ that inherits from a parent class C may or may not re-use C's model queries
to represent its own abstract state. For every model query $s_C \in S_C$ of the parent class
that is not among the heir's model queries $S_{C'}$, $C'$ should provide a linking invariant
to guarantee consistency in the inheritance hierarchy. The linking invariant is a
formula that defines the value returned by $s_C$ in terms of the values returned by the
model queries $S_{C'}$ of the inheriting class. This guarantees that the new model is indeed
a specialization of the previous model, in accordance with the notion of sub-typing
inheritance.

A properly defined linking invariant ensures that every inherited feature has a defi- 
nite semantics in terms of the new model. However, the new semantics may be weaker 
in that a command whose contract in the parent class characterized it as a function, 
becomes characterized as a relation in the child class; that is, incompleteness is
introduced (see Section 3.2).

Example 4. Consider class COLLECTION in Table H) a generic container of elements 
whose model is a bag. Class DISPENSER inherits from COLLECTION and specializes it 
by introducing a notion of insertion order; correspondingly, its model is a sequence. 
The linking invariant of DISPENSER defines the value of the inherited feature bag in
terms of the new feature sequence: the domain of bag coincides with the range of sequence,
and the number of occurrences of any element x in bag corresponds to the number of
occurrences of the same element in sequence.

The linking invariant ensures that the semantics of features is_empty and wipe_out is
unambiguously defined also in DISPENSER. On the other hand, the model-based contract
of command put in COLLECTION and the linking invariant are insufficient to characterize
the effects of put in DISPENSER, as the position within the sequence where the new
element is inserted is irrelevant for the bag.
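
The situation of Example 4 can be mimicked with a few lines of executable code. The following Python sketch is only an illustration and not the paper's Eiffel code: the names bag_of and collection_put_post are hypothetical, a bag is approximated with collections.Counter, and a sequence with a tuple.

```python
from collections import Counter

def bag_of(sequence):
    """Linking invariant (sketch): the inherited bag is derived from the sequence."""
    return Counter(sequence)

def collection_put_post(old_bag, new_bag, v):
    """Bag-level postcondition of put: exactly one more occurrence of v."""
    return new_bag == old_bag + Counter([v])

old_sequence = ("a", "b")
put_at_front = ("c",) + old_sequence   # a stack-like choice
put_at_back = old_sequence + ("c",)    # a queue-like choice

# Both choices satisfy the contract inherited from COLLECTION ...
front_ok = collection_put_post(bag_of(old_sequence), bag_of(put_at_front), "c")
back_ok = collection_put_post(bag_of(old_sequence), bag_of(put_at_back), "c")
# ... although they yield different sequences.
distinct_results = put_at_front != put_at_back

# is_empty stays unambiguous: the bag is empty exactly when the sequence is.
samples = [(), ("a",), ("a", "a")]
only_empty_sequence_has_empty_bag = all(
    (len(bag_of(s)) == 0) == (len(s) == 0) for s in samples
)
```

Under these assumptions, the bag-based contract accepts both insertion positions, which is exactly why it cannot characterize put at the level of the sequence model.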

3.2 Completeness of contracts 

The notion of completeness for the specification of a class gives an indication of how
accurate the contracts of that class are with respect to the model. An incomplete contract
does not fully capture the effects of a feature, suggesting that the contract could be
more detailed or, less commonly, that the model of the class (and hence its interface)
is not abstract enough. Unlike the notion of sufficient completeness for algebraic
specifications [11], which serves a similar purpose, the present definition of completeness
is structurally similar to the concept of completeness for a set of axioms, and
a dual notion of soundness complements it. For simplicity, the following definitions do
not mention feature arguments; introducing them is, however, routine.

3.2.1 Soundness and completeness of a model-based contract. 

Let $f$ be a feature of class $C$. The specification of $f$ denotes two predicates $\mathrm{pre}_f$ and
$\mathrm{post}_f$. $\mathrm{pre}_f$ represents the set of objects of class $C$ that satisfy the precondition. If
$f$ is a query returning an object of class $T$, $\mathrm{post}_f$ has signature $C \times T$ and denotes the
pairs of target and returned objects. If $f$ is a command, $\mathrm{post}_f$ has signature $C \times C$
and denotes the pairs of target objects before and after executing the command.¹

• The precondition of a feature $f$ (query or command) is sound iff: for every
$o_1, o_2 : C$ such that $o_1 \asymp_C o_2$ it is $\mathrm{pre}_f(o_1) \Leftrightarrow \mathrm{pre}_f(o_2)$.²

• The postcondition of a command $m$ is sound iff: for every $o, o'_1, o'_2 : C$ such that
$\mathrm{pre}_m(o)$ and $o'_1 \asymp_C o'_2$ it is $\mathrm{post}_m(o, o'_1) \Leftrightarrow \mathrm{post}_m(o, o'_2)$.

The postcondition of a command $m$ is complete iff: for every $o, o'_1, o'_2 : C$ such
that $\mathrm{pre}_m(o)$, $\mathrm{post}_m(o, o'_1)$, and $\mathrm{post}_m(o, o'_2)$ it is $o'_1 \asymp_C o'_2$.

• The postcondition of a value-bound query $q$ is sound iff: for every $o : C$ and
$t_1, t_2 : T$ such that $\mathrm{pre}_q(o)$ and $t_1 \asymp_T t_2$ it is $\mathrm{post}_q(o, t_1) \Leftrightarrow \mathrm{post}_q(o, t_2)$.

The postcondition of a value-bound query $q$ is complete iff: for every $o : C$ and
$t_1, t_2 : T$ such that $\mathrm{pre}_q(o)$, $\mathrm{post}_q(o, t_1)$, and $\mathrm{post}_q(o, t_2)$ it is $t_1 \asymp_T t_2$.

• The postcondition of a reference-bound query $q$ is sound iff: for every $o : C$ and
$t_1, t_2 : T$ such that $\mathrm{pre}_q(o)$ and $t_1 =_T t_2$ it is $\mathrm{post}_q(o, t_1) \Leftrightarrow \mathrm{post}_q(o, t_2)$.

The postcondition of a reference-bound query $q$ is complete iff: for every $o : C$
and $t_1, t_2 : T$ such that $\mathrm{pre}_q(o)$, $\mathrm{post}_q(o, t_1)$, and $\mathrm{post}_q(o, t_2)$ it is $t_1 =_T t_2$.

¹ These definitions imply the absence of side-effects in evaluating assertions.
² Completeness of preconditions is not an interesting notion and hence it is not defined.

Informally, a sound assertion is one that is consistent with the notion of equivalence
that is appropriate: sound postconditions of commands and value-bound queries do
not distinguish between objects with the same abstract state; sound postconditions of
reference-bound queries do not distinguish between aliases.³

A postcondition is complete if all the pairs of objects that satisfy it are equivalent
(according to the appropriate notion of equivalence). This means that the complete postcondition
of a command defines the effects of the command as a mathematical function (as
opposed to a relation) from the prestate to the abstract poststate. Similarly, the complete
postcondition of a query defines the result as a function of the abstract state of
value-bound arguments and of actual references to reference-bound arguments.

Example 5. The contracts of features is_empty, wipe_out, and put in class COLLECTION
(Table nil are sound and complete; the postcondition of put, in particular, is complete 
as it defines the new value of bag uniquely. In the heir class DISPENSER, however, 
the inherited postcondition of put becomes incomplete: the linking invariant does not 
uniquely define sequence from bag, hence inequivalent sequences (for example, one with 
v inserted at the beginning and another one with v at the end) satisfy the postcondition.
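
On small finite domains, the definition of completeness for a command postcondition can be checked by brute force. The Python sketch below is an illustration under stated assumptions: is_complete, STATES and put_post are hypothetical names, abstract states are tuples over a two-element alphabet, and the precondition is taken to be trivially true.

```python
from collections import Counter
from itertools import product

def is_complete(post, prestates, poststates, equivalent):
    """A postcondition is complete if any two poststates it admits for the
    same prestate are equivalent."""
    for pre in prestates:
        admitted = [p for p in poststates if post(pre, p)]
        for a, b in product(admitted, repeat=2):
            if not equivalent(a, b):
                return False
    return True

ALPHABET = (0, 1)
# All sequences of length 0..3 over the alphabet.
STATES = [t for n in range(4) for t in product(ALPHABET, repeat=n)]
V = 1  # the element inserted by put

def put_post(pre, post):
    """Bag-level postcondition of put: one more occurrence of V."""
    return Counter(post) == Counter(pre) + Counter([V])

def same_bag(a, b):
    return Counter(a) == Counter(b)

def same_sequence(a, b):
    return a == b

# COLLECTION view: states are compared as bags.
complete_as_bag = is_complete(put_post, STATES, STATES, same_bag)
# DISPENSER view: states are compared as sequences.
complete_as_sequence = is_complete(put_post, STATES, STATES, same_sequence)
```

With these definitions the same postcondition is complete when states are compared as bags and incomplete when they are compared as sequences, mirroring Example 5.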

3.2.2 Soundness and completeness in practice. 

As the previous example suggests, reasoning informally (but precisely) about
soundness and completeness of model-based contracts is often straightforward and
intuitive, especially if the guidelines of Section 3.1 have been followed. Completeness
captures the uniqueness of the (abstract) state described by a postcondition, hence
query postconditions in the form Result = exp (s, a) or Result.s = exp (s, a) and command
postconditions in the form s = exp (old s, a), where exp is a side-effect free expression, s
denotes the value returned by the model query of some argument, and a is a reference-bound
argument, are painless to check for completeness.

³ Postconditions of argumentless reference-bound queries are trivially sound for sensible definitions of
reference equality.
Example 6. Consider the following example, from class ARRAY whose model is a
map.

    fill (v: G; l, u: INTEGER)
        -- Put 'v' at all positions in ['l', 'u'].
      require map.domain [l] and map.domain [u]
      ensure map.domain = old map.domain
        (map | {MML_INT_SET} [[l, u]]).is_constant (v)
        (map | (map.domain - {MML_INT_SET} [[l, u]])) =
          old (map | (map.domain - {MML_INT_SET} [[l, u]]))
      end

Pre and postconditions are sound because they both refer only to model queries, or
functions thereof. The following reasoning shows that the postcondition is also complete:
a map is uniquely defined by its domain and by a value for every key in the
domain. The first clause of the postcondition defines the domain completely. Then,
let k be any key in the domain. If k ∈ [l, u] then the second clause defines map(k) = v;
otherwise k ∉ [l, u], and the third clause postulates map(k) unchanged.
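
The completeness argument of Example 6 can be replayed on a tiny finite instance. The following Python sketch is an illustration only: a Python dict stands in for the map model, and fill, fill_pre and fill_post are hypothetical names that transcribe the three postcondition clauses.

```python
from itertools import combinations, product

def fill(m, v, l, u):
    """Model-level effect of fill: v at every key in [l, u], rest unchanged."""
    return {k: (v if l <= k <= u else old) for k, old in m.items()}

def fill_pre(m, l, u):
    """Precondition: both bounds belong to the domain of the map."""
    return l in m and u in m

def fill_post(old, new, v, l, u):
    if new.keys() != old.keys():                         # clause 1: same domain
        return False
    inside = {k for k in new if l <= k <= u}
    outside = new.keys() - inside
    return (all(new[k] == v for k in inside)             # clause 2: constant v
            and all(new[k] == old[k] for k in outside))  # clause 3: unchanged

old_map = {1: "x", 2: "x", 3: "x"}

# Enumerate every candidate map whose domain is a subset of {1, 2, 3}
# and whose values are drawn from {"x", "y"}.
candidates = []
for n in range(4):
    for domain in combinations([1, 2, 3], n):
        for values in product("xy", repeat=n):
            candidates.append(dict(zip(domain, values)))

admitted = [c for c in candidates if fill_post(old_map, c, "y", 2, 3)]
```

Among all enumerated candidates, the postcondition admits a single map, the one computed by fill, which is what completeness requires.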

Soundness is a mandatory requirement for pre and postconditions in the presence 
of model-based contracts, as it boils down to writing contracts that are consistent with 
the chosen level of information hiding. 

On the other hand, how useful is completeness in practice? As a norm, complete- 
ness is a valuable yardstick to evaluate whether the contracts are sufficiently detailed. 
This is not enough to guarantee that the contracts are correct — and meet the origi- 
nal requirements — but the yardstick is serviceable methodologically to focus on what 
a routine really achieves and how that is related to the abstract model. As a result, 
inconsistencies in specifications are less likely to occur, and the impossibility of sys- 
tematically writing complete contracts is a strong indication that the model is incorrect, 
or the implementation is faulty. Either way, a warning is available before attempting a 
correctness proof. 

While complete postconditions should be the norm, there are recurring cases where 
incomplete postconditions are unavoidable or even preferable. Three major sources of 
benign incompleteness are the following. 

• Inherently nondeterministic or stochastic specifications. For example, a class 
for random number generation can use a sequence as model, but its specification 
should not define the precise content of the sequence unambiguously. 

• Usage of inheritance to factor out common parts of (complete) specifications. 
For example, class DISPENSER in Table 2] is a common ancestor of STACK and 
QUEUE. If its interface includes features item, put and remove, its model must be 
isomorphic to a sequence. Then, it becomes impossible to write a complete post- 
condition for put in DISPENSER: the specification of put cannot define precisely 
where an element is added to the sequence; a choice compatible with the seman- 
tics of STACK will be incompatible with QUEUE and vice versa. 

• Imperfections in information hiding. For example, class ARRAYED_LIST is an
array-based implementation of lists which exports a query capacity returning the
size of the underlying array; this piece of information is then part of the model of
the class. Default constructors set capacity to an initial fixed value. Their postconditions,
however, do not mention this default value, hence they are incomplete.
The rationale behind not revealing this information is that clients should not rely
on the exact size of the array when they invoke the constructor.

    note mapped_to: "Sequence G"
    class MML_SEQUENCE [G]
      ...
      extended (x: G): MML_SEQUENCE [G]
          -- Current sequence extended with 'x' at the end
        note mapped_to: "Sequence.extended(Current, x)"
        do ... end
    end

    type Sequence T = [int] T;
    function Sequence.extended <T> (Sequence T, T)
        returns (Sequence T);
    axiom (∀ <T> s: Sequence T, x: T • {Sequence.extended(s, x)}
        Sequence.extended(s, x) == s[Sequence.count(s) + 1 := x]);
    axiom (∀ <T> s: Sequence T, x: T •
        {Sequence.count(Sequence.extended(s, x))}
        Sequence.count(Sequence.extended(s, x)) ==
        Sequence.count(s) + 1);
    ...

Table 5: Snippets from class MML_SEQUENCE (top) and the corresponding Boogie theory
(bottom).

In all these cases, reasoning about completeness is still likely to improve the understanding
of the classes and to question constructively the choices made for interfaces
and inheritance hierarchies.

3.3 Verification: proofs and runtime checking 

This subsection outlines the main ideas behind using model-based contracts for verifi- 
cation with formal correctness proofs and with runtime checking for automated testing. 
Its goal is not to detail any particular proof or testing technique, but rather to sketch 
how to express the semantics of model-based contracts within standard verification 
frameworks. 

3.3.1 Proofs. 

The axiomatic treatment of model classes [14, 23, 6] is quite natural: the semantics of a
model class is defined directly in terms of a theory expressed in the underlying proof 
language, rather than with "special" contracts. The mapping is often straightforward, 
and has the advantage of reusing theories that are optimized for effective usage with 
the proof engine of choice. In addition, the immutability (and value semantics) of 
model classes makes them very similar to mathematical structures and facilitates a 
straightforward translation into mathematical theories. 

In this respect, we are currently developing an accurate mapping of model classes
and model-based contracts into Boogie [2]. First, the mapping introduces axiomatic
definitions of MML model classes as Boogie theories; annotations in the form note
mapped_to connect MML classes to the corresponding Boogie types. For example, Table 5
shows how a portion of the MML_SEQUENCE model class translates into a Boogie
theory: a mapping type [int] T represents sequences of elements of generic type T, and
a few axioms constrain a function Sequence.extended to return values in accordance with
the MML semantics of feature extended.
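
The two properties that the axioms attach to extended (the new element sits at position count + 1, and count grows by one) can also be checked against an executable, immutable model class. The Python sketch below is an assumption-laden illustration, not MML itself: Seq, item and extended_axioms_hold are hypothetical names, and positions are 1-based as in the mapping type.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Seq:
    """Immutable sequence with value semantics (stand-in for a model class)."""
    items: tuple = ()

    def count(self):
        return len(self.items)

    def item(self, i):
        """Element at 1-based position i."""
        return self.items[i - 1]

    def extended(self, x):
        """New sequence with x appended; the receiver is left untouched."""
        return Seq(self.items + (x,))

def extended_axioms_hold(s, x):
    e = s.extended(x)
    # First axiom: e agrees with s on old positions and holds x at count(s) + 1.
    update_axiom = (e.item(s.count() + 1) == x
                    and all(e.item(i) == s.item(i)
                            for i in range(1, s.count() + 1)))
    # Second axiom: count(extended(s, x)) = count(s) + 1.
    count_axiom = e.count() == s.count() + 1
    return update_axiom and count_axiom

s = Seq((1, 2))
e = s.extended(3)
```

Because Seq is frozen and compared by value, s is unchanged after the call, which is the property that makes such classes easy to map to mathematical theories.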

Then, each model query in a class with model-based contracts maps to a Boogie
function that references a representation of the heap; some axioms connect the value
returned by the function to other features in the translated class. For example, the
model query sequence in LINKED_LIST becomes function LinkedList.sequence (HeapType, ref)
returns (Sequence ref).

Finally, model-based contracts are translated into Boogie formulas according to
the mapped_to annotations in model classes. For example, the postcondition clause:

    sequence = old (sequence.front (index).extended (v) + sequence.tail (index + 1))

of put_right in LINKED_LIST (Table 2) maps to the Boogie formula:

    LinkedList.sequence(Heap, Current) = Sequence.concat(Sequence.extended(
        Sequence.front(LinkedList.sequence(old(Heap), Current),
            LinkedList.index(old(Heap), Current)), v),
        Sequence.tail(LinkedList.sequence(old(Heap), Current),
            LinkedList.index(old(Heap), Current) + 1));

3.3.2 Runtime checking and testing. 

Most model classes represent finite mathematical objects, such as sets of finite cardinality,
sequences of finite length, and so on. All these classes can have an implementation
of their operations which is executable in finite time; this supports the runtime checking 
of assertions that reference these model classes. 

Testing techniques can leverage runtime checkable contracts to fully automate the 
testing process: generate objects by randomly calling constructors and commands; 
check the precondition of a routine on the generated objects to filter out valid inputs 
for the routine; execute the routine body on a valid input and check the validity of the 
postcondition on the result; any postcondition violation on a valid input is a fault in the 
routine. 

This approach to contract-based testing has proved extremely effective at uncovering
plenty of bugs in production code [19], hence it is an excellent "lightweight"
precursor to correctness proofs. Contract-based testing, however, is only as good as
the contracts are; the weak postconditions of traditional DbC, in particular, leave many
real faults undetected. Runtime checkable model-based contracts can help in this respect
and boost the effectiveness of contract-based testing by providing more expressive, and
complete, specifications. Section 4 describes some testing experiments that support this
claim.

3.3.3 Consistency of tests and proofs. 

Using contract-based testing as a precursor to correctness proofs poses the problem 
of consistency between two semantics given to model classes: the runtime semantics 
given by an executable implementation and the proof semantics given by a mapping to 
a logical theory. Under reasonable assumptions about the execution environment, con- 
sistency must ensure that a component is proven correct against its model-based speci- 
fication if and only if testing the component never detects a violation of its model-based 
contracts. Establishing this consistency amounts to proving that: (1) the implementation
of each model class is consistent with the mapping of the class to a logical theory;
and (2) the implementation of each model query satisfies its specification. Future work 
will detail and address these problems. 

4 Model-based contracts at work 

This section describes experiments in developing model-based contracts for real
object-oriented software written in Eiffel. The experiments target two non-trivial case studies
based on data-structure libraries (described in Section 4.1) with the goal of demonstrating
that deploying model-based contracts is feasible, practical, and useful. Section 4.2
discusses the successes and limitations highlighted by the experiments.

4.1 Case studies 

The first case study targeted EiffelBase [9], a library of general-purpose data structures
widely used in Eiffel programs; EiffelBase is representative of mature Eiffel
code that makes extensive use of traditional DbC. We selected 7 classes from EiffelBase,
for a total of 304 features (254 of them public) over more than 5700 lines of code.
The 7 classes include 3 widely used container data structures (ARRAY, ARRAYED_LIST,
and LINKED_LIST) and 4 auxiliary classes used by the containers (INTEGER_INTERVAL,
LINKABLE, ARRAYED_LIST_CURSOR, and LINKED_LIST_CURSOR). Our experiments systematically
introduced models and conservatively augmented the contracts of all public
features in these 7 classes with model-based specifications.

The second case study developed EiffelBase2, a new general-purpose data struc- 
ture library. The design of EiffelBase2 is similar to that of its precursor EiffelBase; 
EiffelBase2, however, has been developed from the start with expressive model-based 
specifications and with the ultimate goal of proving its full functional correctness — 
backward compatibility is not one of its primary aims. This implies that EiffelBase2 
revisits and resolves any deficiency and inconsistency in the design of EiffelBase that
impedes achieving full functional correctness or hinders the full-fledged application of 
formal techniques. EiffelBase2 provides containers such as arrays, lists, sets, tables, 
stacks, queues, and binary trees; iterators to traverse these containers; and comparator 
objects to parametrize containers with respect to arbitrary equivalence and order rela- 
tions on their elements. The current version of EiffelBase2 includes 46 classes with 
460 features (403 of them are public) totaling about 5800 lines of code; these figures 
make EiffelBase2 a library of substantial size with realistic functionality. The latest
version of EiffelBase2 is available at http://eiffelbase2.origo.ethz.ch.

4.2 Results and discussion 

This section addresses the following questions based on the experience with the two 
case studies of EiffelBase and EiffelBase2. 

• How many different model classes are needed to write model-based contracts?

• How many contracts can be complete?

• Do executable accurate model-based contracts boost contract-based testing?

    note model: set, relation
    class SET [G]
      ...
      has (v: G): BOOLEAN
          -- Does this set contain 'v'?
        ensure
          Result = not (set * relation.image_of (v)).is_empty
        end

      set: MML_SET [G] -- The set of elements
      relation: MML_RELATION [G, G]
          -- Equivalence relation on elements
    end

    note model: map
    class BINARY_TREE [G]
      ...
      add_root (v: G)
          -- Add a root with value 'v' to an empty tree
        require map.is_empty
        ensure map.count = 1 and map [Empty] = v
        end

      map: MML_MAP [MML_SEQUENCE [BOOLEAN], G]
          -- Map of paths to elements
    end

Table 6: Examples of nonobvious models: classes SET and BINARY_TREE from EiffelBase2.

4.2.1 How many model classes?

Model-based contracts for EiffelBase used model classes for Booleans, integers, refer- 
ences, (finite) sets, relations, and sequences. EiffelBase2 additionally required (finite) 
maps, bags, and infinite maps and relations for special purposes (such as modeling 
comparator objects). These figures suggest that a moderate number of well-understood 
mathematical models suffices to specify a general-purpose library of data structures. 

Determining to what extent this is generalizable to software other than libraries 
of general-purpose data structures is an open question which belongs to future work. 
Domain-specific software may indeed require complex domain-specific model classes 
(e.g., real-valued functions, stochastic variables, finite-state machines), and applica- 
tion software that interacts with a complex environment may be less prone to accurate 
documentation with models. However, even if writing model-based contracts for such 
systems proved exceedingly complex, some formal model is required if the goal is for- 
mal verification. In this sense, focusing model-based contracts on library software is 
likely to have a great payoff through extensive reuse: the many clients of the reusable 
components can rely on expressive contracts not only as detailed documentation but 
also to express their own contracts and interfaces by combining a limited set of well- 
understood, highly dependable components. 

     1 merge_right (other: LINKED_LIST [G])
     2     -- Merge 'other' into current list after cursor position. Do not move cursor. Empty 'other'.
     3   do
     4     ...
     5     other_first_element := other.first_element ; other_count := other.count ; other.wipe_out
     6     if before then first_element := other_first_element ; active := first_element
     7     else ... end
     8     count := count + other_count
     9   ensure
    10     -- Original contract
    11     count = old count + old other.count ; index = old index ; other.is_empty
    12     -- Model based contract
    13     sequence = old (sequence.front (index) + other.sequence + sequence.tail (index + 1))
    14   end

Table 7: Faulty routine merge_right from class LINKED_LIST.

Another interesting remark is that the correspondence between the limited number
of model classes needed in our experiments and the classes using these model classes
is far from trivial: data structures are often more complex than the mathematical structures
they implement. Consider, for example, class SET in Table 6. EiffelBase2 sets are
parameterized with respect to an equivalence relation, hence the model of SET is a pair
of a mathematical set and a relation. Another significant example is BINARY_TREE (also
in Table 6): instead of introducing a new model class for trees or graphs, BINARY_TREE
concisely represents a tree as a map of paths to values; the model of a path is in turn a
sequence of Booleans.
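
The idea of modeling a binary tree as a map from paths to values can be illustrated with a short Python sketch. Everything in it is an assumption made for the example: Node, model_map and add_root_post are hypothetical names, a path is a tuple of Booleans, and False/True are arbitrarily taken to mean "go left"/"go right".

```python
class Node:
    """Concrete tree node (hypothetical implementation class)."""
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right

def model_map(node, path=()):
    """Model query: map from paths (tuples of Booleans) to stored values."""
    if node is None:
        return {}
    paths = {path: node.value}
    paths.update(model_map(node.left, path + (False,)))
    paths.update(model_map(node.right, path + (True,)))
    return paths

def add_root_post(new_map, v):
    """Postcondition of add_root: a single entry, the empty path mapped to v."""
    return len(new_map) == 1 and new_map.get(()) == v

tree = Node("r", Node("a"), Node("b", right=Node("c")))
tree_map = model_map(tree)
```

The empty tuple plays the role of the Empty path in the add_root postcondition of Table 6, and no dedicated model class for trees is needed.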

4.2.2 How many complete contracts? 

Reasoning informally, but rigorously, about the completeness of postconditions,
along the lines of Section 3.2, proved to be straightforward in our experiments. Only
18 (7%) out of 254 public features in EiffelBase with model-based contracts and 17
(4%) out of 403 public features in EiffelBase2 have incomplete postconditions. All of
them are examples of the "intrinsic" incompleteness mentioned at the end of Section 3.2.
EiffelBase2, in particular, was designed trying to minimize the number of features with
intrinsically incomplete postconditions.

These results indicate that model-based contracts make it feasible to write system- 
atically complete contracts; in most cases this was even relatively straightforward to 
achieve. Unsurprisingly, using model-based contracts dramatically increases the com- 
pleteness of contracts in comparison with standard DbC. For example, 42 (66%) out of 
64 public features of class LIST in the original version of EiffelBase (without model- 
based contracts) have incomplete postconditions, including 20 features (31%) without
any postcondition. 

4.2.3 Contract-based testing with model-based contracts. 

The standard EiffelBase library has been in use for many years and has been extensively
tested, both manually and automatically. Are the expressive contracts based on
models enough to help automated testing find new, subtle bugs? While preliminary,
our experiments seem to answer in the affirmative. Applying the AutoTest testing
framework [19] on EiffelBase with model-based contracts for 30 minutes discovered
3 faults; none of them would have been detectable with standard contracts. Running
these tests did not require any modification to AutoTest or model classes, because the
latter include an executable implementation.

The 3 faults reveal subtle mistakes that have gone undetected so far. For example,
consider the implementation of routine merge_right in Table 7: the routine merges a linked
list other into the current linked list at the cursor position by modifying references in
the chain of elements. The then branch of the if statement (line 6) deals with the special
case where the cursor in the current list is before the first element; in this case the first
element of the current list (first_element) will point directly to the first element of the
other list (other_first_element). This is not sufficient, as the routine should also link the end
of the other list to the front of the current one, otherwise all elements in the current
list become inaccessible. The original contract does not detect this fault; the clause
count = old count + old other.count is in particular satisfied as count is updated anyway (line
8), but its value does not reflect the actual content of the new list. On the contrary, the
complete model-based contract (line 13) specifies the desired configuration of the list
after executing the command, which leads to easily detecting the error.
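
The difference between the two contracts can be reproduced on a toy linked list. The Python sketch below is an illustration under stated assumptions, not the EiffelBase code: Cell, LinkedList, merge_right_faulty and run_merge are hypothetical names, the cursor is a plain integer with 0 meaning "before", and the else branch (elided in Table 7) is a guess at a correct splice.

```python
class Cell:
    def __init__(self, item):
        self.item = item
        self.right = None

class LinkedList:
    def __init__(self, items=()):
        self.first_element = None
        self.count = 0
        self.index = 0  # cursor position; 0 means 'before' the first element
        last = None
        for item in items:
            cell = Cell(item)
            if last is None:
                self.first_element = cell
            else:
                last.right = cell
            last = cell
            self.count += 1

    def sequence(self):
        """Model query: items reachable from first_element, in order."""
        items, cell = [], self.first_element
        while cell is not None:
            items.append(cell.item)
            cell = cell.right
        return tuple(items)

    def wipe_out(self):
        self.first_element, self.count, self.index = None, 0, 0

    def merge_right_faulty(self, other):
        other_first, other_count = other.first_element, other.count
        other.wipe_out()
        if self.index == 0:
            # Fault: the old chain is never attached after 'other'.
            self.first_element = other_first
        elif other_first is not None:
            # Assumed correct splice after the cursor cell.
            active = self.first_element
            for _ in range(self.index - 1):
                active = active.right
            last = other_first
            while last.right is not None:
                last = last.right
            last.right = active.right
            active.right = other_first
        self.count += other_count

def run_merge(current, other):
    """Run the routine and evaluate both postconditions on the result."""
    old_seq, old_other = current.sequence(), other.sequence()
    old_count, old_other_count = current.count, other.count
    old_index = current.index
    current.merge_right_faulty(other)
    original_ok = (current.count == old_count + old_other_count
                   and current.index == old_index
                   and other.count == 0)
    expected = old_seq[:old_index] + old_other + old_seq[old_index:]
    model_ok = current.sequence() == expected
    return original_ok, model_ok

original_ok, model_ok = run_merge(LinkedList([1, 2]), LinkedList([8, 9]))
```

With the cursor before the first element, the original contract is satisfied while the model-based one is violated, because the resulting sequence no longer contains the elements 1 and 2.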

5 Related work 

Every fully formal specification ultimately boils down to a mathematical model, and 
the research on formal modeling and analysis is so extensive and diverse that it cannot 
be summarized concisely. This section focuses on a few major approaches to the formal 
specification of object-oriented abstract data types that adopt a stance similar to that of 
the present paper: using highly expressive mathematical models geared towards the 
full functional correctness specification (and verification) of complex data structures. 

Hoare pioneered the usage of mathematical models to define and prove correctness
of data type implementations [13]. This idea spawned much related work, which can
be roughly partitioned in three major lines: algebraic notations, descriptive notations,
and design-by-contract approaches. The following subsections briefly summarize the
main features of each of these techniques; then, Section 5.4 describes the approaches
based on mathematical models that are closest to the present paper.

5.1 Algebraic notations 

Algebraic notations formalize classes in terms of (uninterpreted) functions and axioms
that describe the mutual relationship among the functions. For example, the axiom
s.insert(x).member_of(x) = True defines the mutual semantics of the operations
insert and member_of of a set data type. The most influential work in algebraic specifications
is arguably Guttag and Horning's [11] and Goguen et al.'s [10], which gave a
foundation to much derivative work. The former was also made practical in the Larch
project [12], and introduced a notion of completeness that differs from the one of the
present paper (see Section 3.2) and applies to whole types, not single features.

Algebraic notations emphasize the calculational aspect of a specification. This 
makes them very effective notations to formalize and verify data types at a high level 
of abstraction. In particular, the close connection between rewriting systems 13 and al- 
gebraic definitions enables, in many practical cases, the automated or semi-automated 
verification of consistency and completeness [11] requirements of abstract specifications.
The algebraic approach, on the other hand, does not integrate as well with real
programming languages to document implementations in the form of pre and postcon- 
ditions of single operations. 
5.2 Descriptive notations



Descriptive notations formalize classes in terms of simpler types (ultimately grounded
in simple mathematical models such as sets and relations) and operations defined as
input/output relations (that is, pre and postconditions) constrained by logic or arithmetic
formulas. For example, the insert operation of a set data structure could be defined
by the formula $\forall s, x \bullet [\![ s.\mathit{insert}(x) ]\!] = [\![ s ]\!] \cup \{x\}$, in terms of the union operation
applied to a model set $[\![ s ]\!]$.

Descriptive notations can be used in isolation to build language-independent models,
or to give a formal semantics to concrete implementations. Languages and methods
such as Z [25], B [1], and VDM [14] pursue the former approach, usually within
a top-down development framework. Other specification languages and tools such as
RESOLVE [20], AAL [15], and Jahob [26] are examples of the latter approach for the
programming languages and Java. 

Descriptive notations are apt to develop correct-by-construction designs and to ac- 
curately document implementations, often with the goal of verifying functional cor- 
rectness. Using them in contracts, however, introduces a new notation on top of the 
programming language, which requires additional effort and expertise from the programmer
and makes it more difficult to keep the specification synchronized with
the actual implementation. This weakness is shared by algebraic notations alike.

5.3 Design-by-contract approaches 

Design by contract [18] introduces formal specifications in programs using the same
notation for implementation and annotations, in an attempt to make writing the contracts
as congenial as possible to programmers. The Eiffel programming language
[8] epitomizes the design by contract methodology, together with similar solutions for
other languages such as APP [22] for C, Spec# [2] for C#, and many others.

As discussed in the rest of the paper, using a subset of the programming language
in annotations helps programmers write them [3], but it often does not provide
enough expressive power to formalize (easily) "complete" functional correctness, or
requires cumbersome workarounds to capture the semantics of mathematical concepts
in terms of programming language constructs.

5.4 Model-based annotation languages 

The Java Modeling Language (JML) flTl fTBl is likely the approach that shares the most 
similarities with ours: JML annotations are based on a subset of the Java programming 
language and the JML framework provides a library of model classes mapping mathe- 
matical concepts. While sharing a common outlook, the approaches in JML and in the 
present paper differ in several details pertaining to scope and technical aspects.

At the technical level, JML prefers model variables [5] while our approach leverages
model queries that return the value of immutable model classes; each approach has
its merits, but model queries have the advantage of supporting an axiomatic definition
that is easily grounded in an underlying mathematical theory, and facilitate a seamless
integration with traditional contracts, which are also typically based on queries. Section 3.1
discusses other advantages of model queries. A notational difference is that JML extends
Java's expressions with notations for logic operators and quantifiers, while our
method does not extend Eiffel's syntax and reuses notation such as agents to express 
quantifications and other aspects that belong to expressive specifications. 

In terms of scope, our approach strives to be more methodological and systematic, 
with the primary target of fully contracting a complete library of data structures. Our 
method tries to keep the additional effort required of the programmer to a minimum.
Finally, let us remark that our usage scenarios are multi-faceted, ranging from spec- 
ification and design (also supporting notions such as completeness), to verification, 
runtime checking, and automated testing. 

The present paper extends in scope our previous work on model-based
classes [24, 23], and systematically applies the results to the re-design and re-implementation
of a rich library of data structures. The experience gained in this practical
application also prompted us to refine and revisit aspects of the previous approach,
as discussed at length in the rest of the paper.

6 Conclusions and future work 

The present work introduces a methodology to write strong interface specifications for 
reusable object-oriented components. The methodology is soundly based on expressive 
models based on mathematical notions and features a notion of specification complete- 
ness which is formal, yet easy to reason about. The application of the methodology to 
the development of a library of general-purpose data structures demonstrates its prac- 
ticality and its many uses in analysis, design, and verification. 

Future work includes short- and long-term goals. Among the former, we plan to 
apply model-based contracts to more real-life examples, including application software 
from diverse domains. A user study will try to confirm the preliminary evidence that 
model-based contracts are easy to write, understand, and reason about informally. 

Longer term work will integrate model-based contracts within a comprehensive 
verification environment. This will require, in particular, significant developments in 
the techniques for proofs and tests with model-based contracts. Work on proofs will 
include dealing systematically with the frame problem and extensions of the model- 
based contract methodology to non-public features, including abstraction functions, 
representation invariants, and loop invariants. Work on testing will focus on optimizing 
the runtime performance of model classes. 